Improving deep neural networks based multi-accent Mandarin speech recognition using i-vectors and accent-specific top layer
Authors
Abstract
In this paper, we propose a method that uses i-vectors and model adaptation techniques to improve the performance of deep neural network (DNN) based multi-accent Mandarin speech recognition. I-vectors, which are speaker-specific features, have proven effective for accent identification. They can be used together with conventional spectral features as the input features of DNNs to improve discrimination between different accents. Meanwhile, we adapt DNNs to different accents by using an accent-specific top layer and shared hidden layers. The accent-specific top layer adapts to each accent, while the shared hidden layers, which can be seen as feature extractors, extract high-level features that discriminate between accents. These two techniques are complementary and can be easily combined. Our experiments on the 400-hour Intel Accented Mandarin Speech Recognition Corpus show that the proposed method can significantly improve the performance of DNN-based accented Mandarin speech recognition.
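The architecture described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the class name `MultiAccentDNN`, the layer sizes, the sigmoid activations, the random initialization, and the accent labels `accent_a` and `accent_b` are all assumptions made for illustration, and training is omitted. The sketch only shows the two ideas named in the abstract: the i-vector is appended to the spectral features at the input, and the hidden layers are shared while each accent has its own top (output) layer.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(layer, x):
    """Compute W x + b for one dense layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


class MultiAccentDNN:
    """Hypothetical sketch: shared hidden layers plus one top layer per accent."""

    def __init__(self, n_spectral, n_ivector, n_hidden, n_layers, n_states, accents, seed=0):
        rng = random.Random(seed)
        # The input is the spectral frame concatenated with the i-vector.
        sizes = [n_spectral + n_ivector] + [n_hidden] * n_layers
        self.shared = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
        # One accent-specific output layer per accent (assumed labels).
        self.top = {accent: make_layer(n_hidden, n_states, rng) for accent in accents}

    def forward(self, spectral, ivector, accent):
        h = list(spectral) + list(ivector)
        for layer in self.shared:  # shared feature extractor
            h = sigmoid(affine(layer, h))
        # Accent-specific top layer produces the state posteriors.
        return softmax(affine(self.top[accent], h))


# Toy dimensions chosen only to keep the example small.
model = MultiAccentDNN(n_spectral=40, n_ivector=10, n_hidden=16,
                       n_layers=2, n_states=5, accents=["accent_a", "accent_b"])
frame = [0.1] * 40   # one frame of spectral features (placeholder values)
ivec = [0.05] * 10   # i-vector for the utterance (placeholder values)
post_a = model.forward(frame, ivec, "accent_a")
post_b = model.forward(frame, ivec, "accent_b")
```

Both calls run the same shared hidden layers on the same input; only the final layer differs, so `post_a` and `post_b` are two different posterior distributions over the same set of output states.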
Similar resources
Improving Large Vocabulary Accented Mandarin Speech Recognition with Attribute-Based I-Vectors
It is well recognized that accent has a great impact on ASR of Chinese Mandarin; how to improve performance on accented speech has therefore become a critical issue in this field. The attribute feature has been proven effective for modelling accented speech, resulting in significantly improved performance in accent recognition. In this paper, we propose an attribute-base...
Multi-accent deep neural network acoustic model with accent-specific top layer using the KLD-regularized model adaptation
We propose a multi-accent deep neural network acoustic model with an accent-specific top layer and shared bottom hidden layers. The accent-specific top layer is used to model the distinct accent specific patterns. The shared bottom hidden layers allow maximum knowledge sharing between the native and the accent models. This design is particularly attractive when considering deploying such a syst...
Improving native accent identification using deep neural networks
In this paper, we utilize deep neural networks (DNNs) to automatically identify native accents in English and Mandarin when no text, speaker or gender information is available for the speech data. Compared to conventional methods based on the Gaussian mixture model (GMM), the proposed method benefits from two main advantages: first, DNNs are discriminative models which can provide better discriminat...
Partial Change Accent Models Speech Recog
Regional accents in Mandarin speech result mostly from partial phone changes due to the interlanguage system of non-native speakers. We propose partial change accent models based on accent-specific units with acoustic model reconstruction for accented Mandarin speech recognition. We use phonological rules of dialectical pronunciations together with likelihood ratio test to model actual accented...
Multi-accent Chinese speech recognition
Multiple accents are often present in spontaneous Chinese Mandarin speech, as most Chinese speakers have learned Mandarin as a second language. We propose a method to handle multiple accents as well as standard speech in a speaker-independent system by merging auxiliary accent decision trees with standard trees and reconstructing the acoustic model. In our proposed method, tree structures and shape are modi...